Abstract:

We study the folding thermodynamics of a β-hairpin and two three-stranded β-sheet
peptides using a simplified sequence-based all-atom model, in which folding is driven
mainly by backbone hydrogen bonding and effective hydrophobic attraction. The
native populations obtained for these three sequences are in good agreement with
experimental data. We also show that the apparent native population depends on
which observable is studied; the hydrophobicity energy and the number of native
hydrogen bonds give different results. The magnitude of this dependence matches
well with the results obtained in two different experiments on the β-hairpin.

1 Introduction



Peptide folding is currently attracting considerable attention. Recent advances in
this area include the de novo design of two monomeric three-stranded antiparallel
β-sheet peptides, Betanova [1,2] and Beta3s. Peptides that have the ability to
fold on their own and are well characterized experimentally are valuable not least as
a testbed for theoretical models for protein folding. β-sheet peptides are particularly
interesting in this respect, as β-sheet formation is more challenging to model than
α-helix formation. Therefore, it is no surprise that both Betanova [4,5] and Beta3s [6]
have become the subject of computational studies. Simulations of peptide sequences
that are somewhat similar to these and occur in natural proteins, so-called WW
domains, have been reported, too. [7] For a recent review of computational studies
of peptide folding, see Gnanakaran et al. [8]

Here we present a study of the C-terminal β-hairpin from the protein G B1 domain
and a triple mutant of Betanova called LLM. [2] The original Betanova, which is
less stable than the peptide LLM, [2] is considered too. These different sequences
are studied using an all-atom model with a simplified interaction potential. An
earlier version of this model was tested [9] on the same β-hairpin and an α-helix, the
designed so-called Fs. [10,11] The model was able to fold these two sequences, and the
folded population showed, in both cases, a temperature dependence comparable with
experimental data. It should be pointed out that different sequences are studied using
exactly the same parameters; the interaction potential is, like that of Kussell et al. [12]
but unlike many other simplified potentials for protein folding, entirely sequence-based.
This is of importance even if only one sequence is studied, because it ensures
that the formation and breaking of non-native bonds is not a neglected part of the
dynamics.



2 Materials and Methods 



The model we study is a revised version of an earlier model. [9] It contains all atoms
of the polypeptide chain, including hydrogens, but no explicit water molecules. All
bond lengths, bond angles and peptide torsion angles (180°) are held fixed, so each
amino acid has the Ramachandran torsion angles φ, ψ and a number of side-chain
torsion angles as its degrees of freedom (for Pro, φ is held fixed at −65°). All bond
lengths and bond angles are the same as in the original model. [9]

The potential function

$$E = E_\mathrm{ev} + E_\mathrm{loc} + E_\mathrm{hp} + E_\mathrm{hb} \tag{1}$$

is composed of four terms. The remaining part of this section describes these different
terms, with emphasis on what is new compared with the earlier model. Energy
parameters are quoted in dimensionless units. To set the energy scale of the model,
we use the midpoint temperature for the β-hairpin as determined by Muñoz et al., [13]
$T_\mathrm{m} = 297$ K, which corresponds to $kT \approx 0.440$ in the model.



The first term in Eq. 1, $E_\mathrm{ev}$, represents excluded-volume effects and has the form

$$E_\mathrm{ev} = \kappa_\mathrm{ev} \sum_{i<j} \left[ \frac{\lambda_{ij}(\sigma_i + \sigma_j)}{r_{ij}} \right]^{12}, \tag{2}$$

where $\kappa_\mathrm{ev} = 0.10$ and $\sigma_i = 1.77$, 1.75, 1.55, 1.42 and 1.00 Å for S, C, N, O and H
atoms, respectively. The role of the parameter $\lambda_{ij}$ is to reduce the repulsion between
non-local pairs; $\lambda_{ij} = 1$ for all pairs connected by three covalent bonds and $\lambda_{ij} = 0.75$
otherwise. The reason for using $\lambda_{ij} < 1$ for non-local pairs is partly computational
efficiency, and partly the restricted flexibility of chains with only torsional degrees of
freedom. To speed up the calculations, the sum in Eq. 2 is evaluated using a cutoff
of $r_{ij}^c = 4.3\lambda_{ij}$ Å.
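As an illustration of the excluded-volume term, the following minimal Python sketch evaluates a sum of the form of Eq. 2 for a list of atoms. It is not the implementation used in this work: the function name, the data layout, the caller-supplied `lam(i, j)` returning λ_ij, the inclusion of every pair i < j, and the cutoff factor passed as a parameter are assumptions made for the example.

```python
import math

KAPPA_EV = 0.10
# Atom-type radii sigma_i in Å, as quoted in the text.
SIGMA = {"S": 1.77, "C": 1.75, "N": 1.55, "O": 1.42, "H": 1.00}


def excluded_volume_energy(atoms, lam, cutoff_factor=4.3):
    """Sum kappa_ev * [lam_ij (sigma_i + sigma_j) / r_ij]^12 over pairs i < j.

    atoms: list of (element, (x, y, z)) with coordinates in Å.
    lam:   hypothetical callable lam(i, j) returning lambda_ij (1.0 or 0.75).
    Pairs with r_ij >= cutoff_factor * lambda_ij are skipped.
    """
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            l_ij = lam(i, j)
            r_ij = math.dist(atoms[i][1], atoms[j][1])
            if r_ij >= cutoff_factor * l_ij:
                continue
            sigma_sum = SIGMA[atoms[i][0]] + SIGMA[atoms[j][0]]
            total += (l_ij * sigma_sum / r_ij) ** 12
    return KAPPA_EV * total
```

Because of the twelfth power, the contribution of a pair falls off very quickly with distance, which is why a short cutoff changes the energy only slightly.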

The second interaction term, $E_\mathrm{loc}$, is new compared with the earlier model. By
introducing this term and modifying $\sigma_i$ for C and N, we slightly adjusted the shape
of the Ramachandran φ, ψ distribution. $E_\mathrm{loc}$ is a local electrostatic energy given by

$$E_\mathrm{loc} = \kappa_\mathrm{loc} \sum_I \rho_I \left( \sum \frac{q_i q_j}{r_{ij}/\text{Å}} \right), \tag{3}$$

where the outer sum runs over all non-Pro amino acids along the chain, and the inner
sum represents the interaction between the partial charges of the backbone NH and
CO groups within one amino acid (the sum has four terms: NC, NO, HC and HO).
The partial charges are $q_i = \pm 0.20$ for H and N and $q_i = \pm 0.42$ for C and O. [14]
We put $\kappa_\mathrm{loc} = 125$, which corresponds to a dielectric constant of ≈ 2.0 if $\rho_I = 1$.
The factor $\rho_I$ reduces the interaction strength for the two end amino acids and Gly,
which can be viewed as a simple form of context dependence; $\rho_I = 0.25$ for end amino
acids, $\rho_I = 0.5$ for Gly, and $\rho_I = 1$ otherwise. A similar factor is used for $E_\mathrm{hb}$ (see
below).
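The correspondence between model units and physical units can be checked with a few lines of arithmetic. The sketch below assumes that charges are in units of the elementary charge, that distances are measured in Å, and that the Coulomb constant is 332.06 kcal Å/mol; these conventions and the function names are assumptions of the example, not statements taken from the model definition.

```python
K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)
COULOMB = 332.06    # e^2 / (4 pi eps_0) in kcal Å / mol (assumed convention)


def energy_unit_kcal(t_m=297.0, kt_model=0.440):
    """Size of one dimensionless energy unit in kcal/mol.

    Fixed by requiring that kT at T_m = 297 K equals 0.440 model units.
    """
    return K_B * t_m / kt_model


def implied_dielectric(kappa_loc=125.0):
    """Dielectric constant for which kappa_loc q_i q_j / (r_ij / Å)
    equals the Coulomb energy of charges q_i, q_j (in units of e)."""
    return COULOMB / (kappa_loc * energy_unit_kcal())
```

Under these assumptions one model energy unit is about 1.34 kcal/mol, and a prefactor of 125 gives a dielectric constant close to 2.0, in line with the value quoted in the text.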

The third term in Eq. 1, $E_\mathrm{hp}$, is an effective attraction between hydrophobic side
chains that are not nearest or next-nearest neighbors along the chain. It has the
pairwise additive form

$$E_\mathrm{hp} = -\sum_{I<J} M_{IJ} C_{IJ}, \tag{4}$$

where $C_{IJ}$ is a measure of the degree of contact between side chains I and J, and $M_{IJ}$
sets the energy that a pair in contact gets. The contact measure $C_{IJ}$ is a number
between 0 and 1, defined as before. The interaction matrix $M_{IJ}$ is given in Table 1 and
differs from that used in our earlier study, which was based on the Miyazawa-Jernigan
contact energies. [15] With an all-atom representation, this cannot be expected to be
a good choice for more general sequences, since the Miyazawa-Jernigan contact
energies were derived using a different, reduced chain representation. [16] The new matrix
$M_{IJ}$ has a simplified structure in that the hydrophobic amino acids are grouped into
three classes (see Table 1). The $M_{IJ}$ values are taken to be large for the aromatic
class (Phe, Trp, Tyr), which in part is an attempt to compensate for the fact that it
is relatively difficult for these large side chains with few degrees of freedom to make
proper contacts.

                                I      II     III

I     Ala                       0.0    0.1    0.1
II    Ile, Leu, Met, Val               0.9    2.8
III   Phe, Trp, Tyr                           3.2

Table 1: The interaction matrix $M_{IJ}$ (see Eq. 4). All amino acid pairs not occurring
in the table have $M_{IJ} = 0$.
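The class structure of the interaction matrix makes the hydrophobicity energy easy to tabulate. The following Python sketch is only an illustration: the contact measure is not defined in this section, so it is passed in as a hypothetical callable `contact(i, j)` returning a number between 0 and 1, and the function names and the use of three-letter residue codes are likewise assumptions.

```python
# Hydrophobicity classes and class-pair energies, as listed in Table 1.
CLASS_OF = {
    "Ala": 1,
    "Ile": 2, "Leu": 2, "Met": 2, "Val": 2,
    "Phe": 3, "Trp": 3, "Tyr": 3,
}
M_CLASS = {
    (1, 1): 0.0, (1, 2): 0.1, (1, 3): 0.1,
    (2, 2): 0.9, (2, 3): 2.8,
    (3, 3): 3.2,
}


def m_pair(res_a, res_b):
    """Interaction strength for two residues; zero if either is not in a class."""
    class_a, class_b = CLASS_OF.get(res_a), CLASS_OF.get(res_b)
    if class_a is None or class_b is None:
        return 0.0
    return M_CLASS[(min(class_a, class_b), max(class_a, class_b))]


def hydrophobic_energy(sequence, contact):
    """Pairwise additive sum of -M * C over residue pairs.

    Nearest and next-nearest neighbors along the chain are excluded,
    so the inner loop starts at a sequence separation of three.
    """
    energy = 0.0
    for i in range(len(sequence)):
        for j in range(i + 3, len(sequence)):
            energy -= m_pair(sequence[i], sequence[j]) * contact(i, j)
    return energy
```

For example, a fully formed Trp-Tyr contact contributes −3.2, whereas a fully formed Ala-Leu contact contributes only −0.1.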

The last term of the potential, the hydrogen-bond energy $E_\mathrm{hb}$, is given by

$$E_\mathrm{hb} = \epsilon_\mathrm{hb}^{(1)} \sum_{\mathrm{bb-bb}} \rho_{ij}\, u(r_{ij})\, v(\alpha_{ij}, \beta_{ij}) + \epsilon_\mathrm{hb}^{(2)} \sum_{\mathrm{sc-bb}} \rho_{ij}\, u(r_{ij})\, v(\alpha_{ij}, \beta_{ij}), \tag{5}$$

where the two terms represent backbone-backbone interactions and interactions
between the backbone and charged side chains, respectively. The second term in Eq. 5
does not include any side chain-side chain interactions, as it did in our earlier study.
Apart from that, the only difference compared with the earlier model is the factor
$\rho_{ij}$, which like $\rho_I$ in Eq. 3 can be seen as a simple form of context dependence. We
put $\rho_{ij} = 0.25$ if either of the two amino acids involved is an end amino acid, $\rho_{ij} = 0.5$
if either of them is a Gly, and $\rho_{ij} = 1$ otherwise. The constants $\epsilon_\mathrm{hb}^{(1)} = 3.1$ and $\epsilon_\mathrm{hb}^{(2)} = 2.0$
as well as the functions u and v are exactly the same as before. [9]

To study the thermodynamic behavior of this model, we use simulated tempering, [17]
in which the temperature is a dynamical variable. Details on our implementation
of this method can be found elsewhere. For a review of simulated tempering
and other generalized-ensemble techniques for protein folding, see Hansmann and
Okamoto. [19] Eight different temperatures are studied, ranging from 284 K to 371 K.
For the backbone degrees of freedom, we use three different elementary moves: first,
the pivot move in which a single torsion angle is turned; second, a semi-local
method [21] that works with up to eight adjacent torsion angles, which are turned
in a coordinated way; and third, a symmetry-based update of three randomly chosen
backbone torsion angles. [9] For the side-chain degrees of freedom, we use simple
Metropolis updates of individual angles.
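For readers unfamiliar with simulated tempering, the temperature update can be sketched as follows. This is the generic textbook form of the method, in which conformations and temperature index k are sampled jointly with weight exp(−β_k E + g_k); it is not a description of the implementation used here. The function names, the nearest-neighbor proposal and the weight parameters `g` are assumptions of the example.

```python
import math
import random


def beta_model(t_kelvin, t_m=297.0, kt_model=0.440):
    """Inverse temperature in model units, using kT = 0.440 at 297 K."""
    return t_m / (kt_model * t_kelvin)


def tempering_step(k, energy, betas, g, rng=random):
    """One Metropolis update of the temperature index k at fixed conformation.

    betas: list of inverse temperatures; g: list of tunable weight parameters.
    A neighboring index is proposed and accepted with probability
    min(1, exp(-(beta_new - beta_old) * energy + g_new - g_old)).
    """
    k_new = k + rng.choice((-1, 1))
    if k_new < 0 or k_new >= len(betas):
        return k  # proposal outside the temperature ladder is rejected
    log_ratio = -(betas[k_new] - betas[k]) * energy + g[k_new] - g[k]
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return k_new
    return k
```

In such a scheme the parameters g_k are usually tuned so that all temperatures are visited with comparable frequency; conformational moves at fixed temperature alternate with temperature updates of this kind.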

For each peptide, eight independent Monte Carlo runs were performed, starting from
random conformations. Each run required a few days on a standard desktop computer,
and contained several folding/unfolding events. The similarity between the
results from the different runs strongly suggests that the simulations did map out all
relevant free-energy minima of the model. All statistical errors quoted are 1σ errors
obtained from the variance between the runs. The fits of data discussed in the next
section were carried out by using a Levenberg-Marquardt procedure. [22]
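An error estimate from independent runs can be sketched as below. We assume here that the 1σ error is the standard error of the mean over the runs, which is one common reading of "obtained from the variance between the runs"; the function name and the numbers in the usage note are hypothetical.

```python
import math
import statistics


def mean_and_error(run_values):
    """Mean over independent runs and its 1-sigma standard error.

    The error is sqrt(s^2 / n), where s^2 is the unbiased sample variance
    between runs and n is the number of runs.
    """
    n = len(run_values)
    mean = statistics.fmean(run_values)
    error = math.sqrt(statistics.variance(run_values) / n)
    return mean, error
```

Applied to eight made-up per-run native populations such as 0.36, 0.40, 0.38, 0.42, 0.35, 0.39, 0.41 and 0.37, this gives a mean of 0.385 with an error slightly below 0.01.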

For a given protein structure, there generally exist alternative structures with similar
secondary-structure content but different overall topologies. This holds true even
for a small β-hairpin, for which a flip of the side chains gives rise to a topologically
distinct structure. To make models discriminate between different topologies is a
delicate task. To assess whether or not a model is able to do that, it is necessary to
make a suitable choice of observables. In our calculations, we monitor two variables
that can be used for this purpose: first, the root-mean-square deviation (rmsd) from
the folded structure, $\Delta$, calculated over all non-H atoms (a backbone rmsd is much
less informative); and second, the number of native backbone-backbone hydrogen
bonds, $N_\mathrm{hb}^\mathrm{nat}$. Figure 1 illustrates which hydrogen bonds we take to be present in
the native states of the peptides studied. In our calculations, a hydrogen bond is
considered formed if the energy is less than $-\epsilon_\mathrm{hb}^{(1)}/3$ (see Eq. 5).

Using the original model, we studied the α-helical Fs peptide and a β-hairpin. [9]
Here, we study the same β-hairpin and two three-stranded β-sheet peptides, LLM
and Betanova. Before turning to these results, it should be pointed out that the
Fs sequence still makes an α-helix in the revised model, as can be seen from the
free energy $F(\Delta, E)$ in Fig. 2. $F(\Delta, E)$ has a pronounced, dominating minimum at
$\Delta \approx 2$-3.5 Å, which corresponds to α-helix. In addition, there are weakly populated
minima corresponding to β-sheet structures at $\Delta \approx 9$ Å and $\Delta \approx 12$ Å.

Figure 1: Schematic illustration of the backbone-backbone hydrogen bonds taken as
native for (a) the C-terminal β-hairpin from the protein G B1 domain, and
(b) the mutant LLM of Betanova. [2] Diagram (b) is used for the original Betanova,
too (with L12 and M17 replaced by N12 and T17, respectively).

Figure 2: Free energy $F(\Delta, E)$ for Fs at T = 284 K, where $\Delta$ denotes heavy-atom
rmsd from an ideal α-helix and E is energy. The contours are spaced at intervals of
1 kT and dark tone corresponds to low free energy. Contours more than 6 kT above
the minimum free energy are not shown.

Figure 3: The temperature dependence of (a) the hydrophobicity energy $E_\mathrm{hp}$ and
(b) the number of native hydrogen bonds, $N_\mathrm{hb}^\mathrm{nat}$, for the β-hairpin. The line in (a)
is a fit to the two-state expression $E_\mathrm{hp}(T) = (E_\mathrm{hp}^\mathrm{u} + E_\mathrm{hp}^\mathrm{n} K(T))/(1 + K(T))$, where
$K(T) = P^\mathrm{n}(T)/P^\mathrm{u}(T)$, $P^\mathrm{n}(T)$ and $P^\mathrm{u}(T)$ being the populations of the native and
unfolded states, respectively. The effective equilibrium constant $K(T)$ is assumed to
have the first-order form $K(T) = \exp[(1/kT - 1/kT_\mathrm{m})\Delta E]$, where $T_\mathrm{m}$ is the midpoint
temperature and $\Delta E$ is the energy difference between the two states. The baselines
$E_\mathrm{hp}^\mathrm{u}$ and $E_\mathrm{hp}^\mathrm{n}$ are taken as constants.

3 Results and Discussion 



Using the model described in the previous section, we first study the 16-amino acid
C-terminal β-hairpin from the protein G B1 domain. An important quantity monitored
in our earlier study of this peptide [9] was the hydrophobicity energy $E_\mathrm{hp}$. This
variable should be strongly correlated with Trp fluorescence, which Muñoz et al. [13]
used to characterize the melting behavior of this peptide. The temperature dependence
of $E_\mathrm{hp}$ was found to be in reasonable agreement with the data of Muñoz et
al. Several other groups have performed atomic simulations of the same β-hairpin,
with or without explicit water. In contrast to ours,
most models seem to require further calibration in order not to show a temperature
dependence much weaker than that of experimental data.

Figure 3a shows the temperature dependence of $E_\mathrm{hp}$ in the revised model. The line is
a fit of the data to a simple (first-order) two-state expression. The parameters of the
fit are the midpoint temperature $T_\mathrm{m}$, the energy difference $\Delta E$, and two baselines.
We use the parameter $T_\mathrm{m}$ to set the energy scale of the model; this parameter is
taken as $T_\mathrm{m} = 297$ K as determined by Muñoz et al. [13] For the energy difference, we
then obtain $\Delta E = 13.1$ kcal/mol. These values of the two-state parameters $T_\mathrm{m}$ and
$\Delta E$ correspond to a native population of 74% at T = 284 K, which agrees well with
the result of Muñoz et al., 72% at T = 284 K. [13] The NMR analysis of Blanco et
al. [23] gave, by contrast, a lower native population, 42% at T = 278 K. A possible
explanation of this discrepancy would be that this peptide does not show a clear
two-state behavior; the apparent native population may then very well depend on which
quantity is studied. At first glance, this explanation may seem unlikely, given that
the temperature dependence of the Trp fluorescence data to a good approximation
showed two-state character. [13] Let us therefore stress that, although the
two-state fit in Fig. 3a looks quite good, this sequence does not show ideal two-state
behavior in our model. This can be seen, for example, from the energy distribution,
which lacks a clear bimodal shape. This was shown in our earlier study, [9] and
holds true in the revised model as well. Similar results have also been obtained in
simulations of a designed, fast-folding three-helix-bundle protein. [31]
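The step from the two-state parameters to a native population is a short calculation, sketched below in Python. The functions follow the first-order two-state expressions given in the caption of Fig. 3; the function names and the value used for the Boltzmann constant in kcal/(mol K) are choices made for the example.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def equilibrium_constant(t, t_m, delta_e):
    """K(T) = exp[(1/kT - 1/kT_m) * dE], with dE in kcal/mol and T in K."""
    return math.exp((1.0 / (K_B * t) - 1.0 / (K_B * t_m)) * delta_e)


def native_population(t, t_m, delta_e):
    """Two-state native population P_n = K / (1 + K)."""
    k_eq = equilibrium_constant(t, t_m, delta_e)
    return k_eq / (1.0 + k_eq)


def two_state_ehp(t, t_m, delta_e, e_native, e_unfolded):
    """Two-state expression for the observable, with constant baselines."""
    k_eq = equilibrium_constant(t, t_m, delta_e)
    return (e_unfolded + e_native * k_eq) / (1.0 + k_eq)
```

With T_m = 297 K and ΔE = 13.1 kcal/mol, `native_population(284.0, 297.0, 13.1)` evaluates to about 0.73, close to the 74% quoted above; the population is exactly one half at T = T_m by construction.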

In Fig. 3b we show the temperature dependence of the number of native hydrogen
bonds, $N_\mathrm{hb}^\mathrm{nat}$, which we expect to be more strongly correlated than $E_\mathrm{hp}$ with the
NMR measurements of Blanco et al. For $N_\mathrm{hb}^\mathrm{nat}$, a two-state fit is not meaningful; for
that, further data at lower temperatures would be needed. On the other hand, the
quantity $N_\mathrm{hb}^\mathrm{nat}$ can be used as a direct measure of nativeness. Based on inspection of
many examples, we use as a criterion for nativeness that at most two of the native
hydrogen bonds should be missing, which can be used for the two other peptides too
(see below). For the β-hairpin with seven native hydrogen bonds (see Fig. 1a), this
criterion ($N_\mathrm{hb}^\mathrm{nat} \geq 5$) gives a native population of 39% at T = 284 K. This value is
close to the estimate of Blanco et al., [23] 42% at T = 278 K. Due to uncertainties
about the precise definitions of nativeness and of when a hydrogen bond is formed,
this agreement could be somewhat accidental. There is no doubt, however, that the
native population obtained using $N_\mathrm{hb}^\mathrm{nat}$ is significantly lower than that obtained above
using $E_\mathrm{hp}$. Figure 4 shows the probability distributions of $N_\mathrm{hb}^\mathrm{nat}$ at T = 284 K and
T = 306 K. The number of native hydrogen bonds is seen to rapidly decrease with
increasing T, as it should.
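The hydrogen-bond-based nativeness criterion amounts to a simple count over sampled conformations. The Python sketch below is illustrative only: the function names are hypothetical, each sample is assumed to be summarized by its number of formed native hydrogen bonds, and a bond is taken as formed when its energy lies below minus one third of the backbone-backbone strength 3.1, as stated in the Methods section.

```python
def hbond_formed(bond_energy, eps_hb=3.1):
    """A hydrogen bond counts as formed if its energy is below -eps_hb / 3."""
    return bond_energy < -eps_hb / 3.0


def native_fraction(nhb_samples, n_native, max_missing=2):
    """Fraction of samples with at most max_missing native bonds missing.

    nhb_samples: number of formed native hydrogen bonds in each sample.
    n_native:    number of native hydrogen bonds (7 for the hairpin, 8 for LLM).
    """
    threshold = n_native - max_missing
    n_hits = sum(1 for n in nhb_samples if n >= threshold)
    return n_hits / len(nhb_samples)
```

For the β-hairpin (seven native bonds) the threshold is five formed bonds; for the three-stranded peptides (eight native bonds) it is six.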

The two-state parameter $\Delta E$ extracted from the $E_\mathrm{hp}$ data is somewhat smaller here,
$\Delta E = 13.1$ kcal/mol, than it was in our earlier study, $\Delta E = 16.1$ kcal/mol. [9] The
reason for this is not so much that the model has changed, but rather that the fits
were done in different ways. In our previous study, $T_\mathrm{m}$ was held fixed at the specific
heat maximum. Here, following the analysis of Muñoz et al. more closely, we take
$T_\mathrm{m}$ to be a parameter of the fit. The fitted value of $T_\mathrm{m}$ turns out to lie slightly below
(1-2%) the specific heat maximum. Our new analysis improves the agreement with
the result of Muñoz et al., which was $\Delta E = 11.6$ kcal/mol. [13]

Figure 4: Histogram of the number of native hydrogen bonds, $N_\mathrm{hb}^\mathrm{nat}$, at T = 284 K
(full line) and T = 306 K (dashed line) for the β-hairpin.

Although the precise shape of the structures with lowest energy is sensitive to the
details of the model, it is also interesting to make an rmsd-based comparison with
experimental data. For this purpose, we use the NMR structure for the full protein G
B1 domain (PDB code 1GB1, first model), as the NMR restraints for the isolated
β-hairpin were insufficient to determine a unique structure. Figure 5a shows the free
energy $F(\Delta, E)$ calculated as a function of rmsd, $\Delta$, and energy, E, at T = 284 K.
Three distinct, highly populated minima can be seen. The two minima with lowest
E are found at $\Delta \approx 2.0$ Å and $\Delta \approx 3.1$ Å, respectively. Both these correspond to
β-hairpin structures with a high $N_\mathrm{hb}^\mathrm{nat}$. That $N_\mathrm{hb}^\mathrm{nat}$ is high implies, in particular,
that the topology of the β-hairpin is the native one. The main difference between
these two minima lies in the shape of the turn. The third minimum, at $\Delta \approx 4.0$ Å,
is somewhat higher in E than the first two. This minimum is also dominated by
β-hairpin structures with the native topology and many hydrogen bonds, but the
two strands tend to be out of register with each other, so $N_\mathrm{hb}^\mathrm{nat}$ is low. Largely, it
is the existence of this third minimum that makes the apparent native population
depend on which of the observables $E_\mathrm{hp}$ and $N_\mathrm{hb}^\mathrm{nat}$ we use. Finally, there are also two
weakly populated free-energy minima corresponding to β-sheet structures with the
non-native topology ($\Delta \approx 5.3$ Å) and α-helix ($\Delta \approx 8$-10 Å), respectively.

Figure 5: Free energy $F(\Delta, E)$ at T = 284 K for (a) the β-hairpin and (b) the peptide
LLM. E is energy and $\Delta$ is a heavy-atom rmsd, calculated using all the 16 amino
acids for the β-hairpin and amino acids 3-18 for LLM. The first two and last two
amino acids of LLM do not take part in the β-sheet structure. The contour levels
are as in Fig. 2.
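Free-energy surfaces such as those in Figs. 2 and 5 are typically obtained from a two-dimensional histogram of the sampled rmsd and energy values, using F = −kT ln P. The following Python sketch illustrates this under the assumption that the samples come from a single temperature; the function name, the bin widths and the choice to report F in units of kT relative to the most populated bin are assumptions of the example.

```python
import math
from collections import Counter


def free_energy_surface(samples, rmsd_bin, energy_bin):
    """Return {(rmsd_index, energy_index): F/kT} for all populated bins.

    samples: iterable of (rmsd, energy) pairs sampled at one temperature.
    F/kT = -ln(count / max_count), so the most populated bin has F = 0.
    """
    counts = Counter(
        (math.floor(rmsd / rmsd_bin), math.floor(energy / energy_bin))
        for rmsd, energy in samples
    )
    max_count = max(counts.values())
    return {cell: -math.log(c / max_count) for cell, c in counts.items()}
```

A contour plot with levels spaced by 1 and truncated at 6 then corresponds to the convention described in the caption of Fig. 2.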



3.2 Three-Stranded β-Sheets



The de novo design of the 20-amino acid three-stranded antiparallel β-sheet peptide
Betanova was reported in 1998. [1] Recently, mutants of this peptide with higher
stability were created by Lopez de la Paz et al. [2] Among the most stable mutants
found was the triple mutant LLM (Val5Leu, Asn12Leu, Thr17Met). The peptide
LLM and the original Betanova were estimated [2] to have native populations of 36%
and 9%, respectively, at T = 283 K, based on NMR data. Melting curves have, as far
as we know, not been reported for these peptides.

Our simulations of LLM show first of all that this sequence does make a
three-stranded antiparallel β-sheet in this model. This can be seen from Fig. 5b, which
shows the free energy $F(\Delta, E)$ at T = 284 K. The free energy has a broad minimum
at $\Delta \approx 3$-5 Å, corresponding to β-sheet structures with the native topology and a
high $N_\mathrm{hb}^\mathrm{nat}$. The shape of the β-sheet varies within the minimum. At $\Delta \approx 3.4$ Å,
where the free energy is lowest, the β-sheet has a bent shape, which enables the
chain to make strong hydrophobic contacts. At $\Delta \approx 4.5$ Å, the β-sheet tends to
be much flatter, which is hydrophobically disfavored but makes it possible for the
chain to form more perfect hydrogen bonds. There is also a free-energy minimum at
$\Delta \approx 6.5$ Å, which corresponds to three-stranded antiparallel β-sheet structures with
the non-native topology. However, the native topology is the thermodynamically
favored one. Note that the native and non-native topologies exhibit non-overlapping
sets of backbone-backbone hydrogen bonds, so $N_\mathrm{hb}^\mathrm{nat}$ is low at the $\Delta \approx 6.5$ Å minimum.

The main reason why the model favors the native topology over the non-native one
lies in the side-chain orientations for the hydrophobic pairs Trp3-Leu12 and
Leu5-Tyr10. The Cα-Cβ vectors of these pairs point inwards in the non-native topology,
which makes it difficult to achieve proper contacts between the side chains. This
is much easier to accomplish in the native topology, where the Cα-Cβ vectors point
outwards. Interestingly, the situation is similar for the β-hairpin above. [9] The
β-hairpin also has two pairs of hydrophobic side chains that are 'bow-legged' in the
native topology and 'knock-kneed' in the non-native one.

Next we estimate the native population for LLM. As we want to compare with the
NMR-based results of Lopez de la Paz et al., we consider $N_\mathrm{hb}^\mathrm{nat}$ rather than $E_\mathrm{hp}$.
Figure 6a shows the $N_\mathrm{hb}^\mathrm{nat}$ distribution at T = 284 K. In addition to the native and
non-native peaks at high and low $N_\mathrm{hb}^\mathrm{nat}$, respectively, this distribution exhibits a third
peak at $N_\mathrm{hb}^\mathrm{nat} = 4$. The typical conformation at this peak contains only the first of
the two native β-turns (see Fig. 1b). The second β-turn is less stable, as will be
discussed below. Using the criterion that at most two native hydrogen bonds should
be missing ($N_\mathrm{hb}^\mathrm{nat} \geq 6$), we obtain a native population of 38% at T = 284 K for LLM,
which agrees well with the result of Lopez de la Paz et al., [2] 36% at T = 283 K.
We also performed simulations of the original Betanova, and Fig. 6a shows the result
for this sequence too. From this figure it is evident that Betanova is less stable than
LLM. The probability that $N_\mathrm{hb}^\mathrm{nat} \geq 6$ is 14% for Betanova at T = 284 K, which means
that this criterion gives a native population close to the NMR-based result of Lopez
de la Paz et al. not only for LLM but also for Betanova.

That the model predicts LLM to be more stable than Betanova is not surprising 
because LLM has a more pronounced hydrophobic core. The agreement with exper- 
imental data is, nevertheless, remarkably good, especially since these calculations do 
not involve any adjustable parameter; the energy scale of the model is fixed using 
melting data for the β-hairpin and is then left unchanged.

Figure 6b shows the frequencies of occurrence for the different native hydrogen bonds
(see Fig. 1b) for LLM and Betanova. For Betanova, there is a clear difference between
the hydrogen bonds involved in the first β-turn (1-4) and those involved in the second
β-turn (5-8). The latter four occur infrequently, showing that the second β-turn is
quite unstable, which is in line with the conclusions of Lopez de la Paz et al. [2] For
LLM, the difference in stability between the two β-turns is less pronounced. However,
hydrogen bond 7, which connects Met17 to Tyr10 (see Fig. 1b), is quite unstable. The
reason for this is that the side chain of Met17 can make better contacts with other
hydrophobic side chains if the strand is slightly bent. This bend makes it difficult for
hydrogen bond 7 to form.

Figure 6: (a) Histogram of the number of native hydrogen bonds, $N_\mathrm{hb}^\mathrm{nat}$, at T = 284 K
for LLM (full line) and Betanova (dashed line). (b) The frequency of occurrence for
the eight native hydrogen bonds (labeled according to Fig. 1b) for LLM (○) and
Betanova (□) at T = 284 K.



Finally, in Fig. 7 we show the temperature dependence of $E_\mathrm{hp}$ and $N_\mathrm{hb}^\mathrm{nat}$ for LLM. As
in the β-hairpin case, we find that a simple two-state fit provides a good description of
the data for $E_\mathrm{hp}$. The fitted values of the parameters $T_\mathrm{m}$ and $\Delta E$ are $T_\mathrm{m} = 303$ K and
$\Delta E = 13.0$ kcal/mol, which means that the native population obtained from this fit
is significantly higher than that obtained from the $N_\mathrm{hb}^\mathrm{nat}$ distribution (see Fig. 6a). So,
the model predicts that the apparent native population depends on which observable
is used for this sequence, too. We are not aware of any existing experimental data
that support, or refute, this conclusion for LLM.

Figure 7: The temperature dependence of (a) the hydrophobicity energy $E_\mathrm{hp}$ and
(b) the number of native hydrogen bonds, $N_\mathrm{hb}^\mathrm{nat}$, for the LLM peptide. The line in
(a) is a first-order two-state fit (as in Fig. 3).



4 Conclusion 



Using a novel all-atom model with a simplified sequence-based potential, we have
investigated the equilibrium behaviors of three β-sheet peptides. We determined
native populations for these peptides in two ways, from the distribution of the number
of native hydrogen bonds ($N_\mathrm{hb}^\mathrm{nat}$) and from the temperature dependence of the
hydrophobicity energy ($E_\mathrm{hp}$). These two estimates were compared with experimental
results based on NMR and Trp fluorescence, respectively. This comparison is
summarized in Table 2. The agreement with experimental data is good, which in
particular means that the model to a good approximation is able to reproduce the
relative stabilities of these three peptides, as obtained from the NMR measurements.
In line with the experimental results on the β-hairpin, we find that the apparent
native population depends on whether we use $N_\mathrm{hb}^\mathrm{nat}$ or $E_\mathrm{hp}$. This reflects the fact
that the melting transition is not a clear two-state transition in our model (for any of
these three sequences). It is also worth noting that, although the two-state picture
is an oversimplification, the temperature dependence of $E_\mathrm{hp}$ is quite well described
by a simple two-state expression (see Figs. 3a and 7a). Computational studies of
the β-hairpin have also been performed by many other groups, but the temperature
dependence obtained was typically too weak, as has been pointed out by Zhou et
al. [27] Our model shows a temperature dependence which is in good agreement with
experimental data.

Our study of these three different peptides was carried out using one and the same
set of parameters. In addition, we showed that the Fs peptide makes an α-helix for
this choice of parameters. While these results are very encouraging, it is important to
stress that we do not expect the model to be directly applicable to other sequences.
However, by confronting the model with new sequences, we hope it will be possible to
refine the potential, and thereby further extend its applicability. The present study
was a first step in this direction, in which the model was improved by studying LLM
and Betanova. To make the model able to fold these sequences, many changes were
made, several of which were minor. The two perhaps most important changes were
the replacement of the old hydrophobicity matrix ($M_{IJ}$), and the introduction of
a simple form of context dependence for the hydrogen bonds. Whether it will be
possible to carry on this process to a point where the model correctly reproduces
the thermodynamics of small proteins remains to be seen. One thing that probably
will be necessary in order to achieve this goal is to include multibody effects in the
hydrophobicity potential; the present pairwise additive potential is likely to become
insufficient as the chains get larger. Computationally, there is room for extending
the calculations to larger chains; the calculations presented here required about two
weeks on a standard desktop computer for each peptide.

              Model, 284 K                      Experiment
              from N_hb^nat   from E_hp         NMR                 Trp fluorescence

β-hairpin     39%             74%               42%, 278 K [23]     72%, 284 K [13]
LLM           38%                               36%, 283 K
Betanova      14%                               9%, 283 K

Table 2: Summary of apparent native populations obtained from simulations and
experimental data, respectively (see the text). The model results have statistical
errors of 1-4%.

Acknowledgments: We thank Luis Serrano and Manuela Lopez de la Paz for providing
NMR data for LLM and Betanova. This work was in part supported by the
Swedish Foundation for Strategic Research and the Swedish Research Council.